The present report extends the method of fixed point clustering by introducing
an indirect criterion for the number of clusters. The derived probability
function allows an objective distinction between clustered data and data in
between clusters. Applications to simulated data illustrate the clustering
method and the probability function.
PACS numbers: 02.60.-x, 45.10.-b 

1 Introduction 

The dynamics of spatially extended systems can be measured by multi-detector
arrays. Most spatio-temporal analysis methods that fit multi-dimensional
dynamical models consider data over the full time range. An earlier report
described a method for partitioning spatio-temporal signals into time segments
in which the signal can be modeled by deterministic ordinary differential
equations near fixed points. Each dynamical system is determined by a nonlinear
spatio-temporal analysis. The algorithm proposed there works with an arbitrary
number of clusters k. Since the results depend on k, an objective criterion for
the number of clusters is necessary. The present report introduces a different
segmentation algorithm and aims to derive an indirect criterion for the number
of temporal segments.

In the present report, we use the K-Means algorithm for segmenting data, which
assigns each data point to a single cluster. Since K-Means works with an
arbitrary number of clusters and this number is crucial to the clustering
results, we derive a probability function representing the degree of membership
of a data point at time t to a cluster. It assigns data either to clusters or
to transition parts between clusters and hence determines the number of
necessary clusters. Applications to simulated non-stationary data illustrate
the probability measure.

2 Fixed Point Clustering (FPC) 

In the following, a signal trajectory is assumed to be composed of a sequence
of segments governed by saddle point dynamics. Under the hypothesis that these
segments comprise the main functionality of the underlying system, we aim to
extract them from the signal. Trajectories approach saddle points along their
stable manifolds, whereas they leave the vicinity of the fixed points along the
unstable manifolds. If the signal is sampled at a constant rate, the signal
points accumulate close to the fixed points. This accumulation may also be
regarded as a point cluster in data space. Consequently, stable manifolds in
multi-dimensional signals lead to point clusters, and their detection can be
treated as a recognition problem in data space.



The clustering algorithm 

An $N$-dimensional spatio-temporal signal can be described by a data vector
$\mathbf{q}(t) \in \mathbb{R}^N$, where the component $q_j(t_i)$ represents a
data point at time $t_i$ and detection channel $j$. The clustering algorithm
aims at cluster centers $\{\mathbf{k}\}$ whose mean Euclidean distance to a set
of data points $\mathbf{q}(t_i)$ is minimal. The presented implementation
follows Moody et al. and is sketched in Fig. 1.
Cluster centers $\mathbf{k}^{n}$ are initialized at random locations in the
data, and their Euclidean distances to each data point are calculated. K-Means
defines the membership of a data point to a cluster by the smallest Euclidean
distance to its center. Thus, the data are segmented into k clusters, and new
cluster centers $\mathbf{k}^{n+1}$ are calculated as the means of the clustered
data points. Distances between data points and centers are re-estimated until
a convergence condition is fulfilled. This criterion can be set either as an
upper limit on the Euclidean distance between sequential cluster centers
$\mathbf{k}^{n}$, $\mathbf{k}^{n+1}$ or as a number of iterations. We choose to
limit the number of iterations to 25.
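
The steps described above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the implementation used for the results: the function name `kmeans`, the `seed` argument, the choice of initializing centers at randomly drawn data points, and the handling of empty clusters are all assumptions made for this sketch.

```python
import numpy as np

def kmeans(data, k, n_iter=25, seed=0):
    """Plain K-Means with a fixed iteration budget (hypothetical helper).

    data: array of shape (T, N), one row per time sample q(t_i).
    Returns (centers, labels, dist); dist has shape (T, k).
    """
    rng = np.random.default_rng(seed)
    # Assumption: centers start at k distinct, randomly chosen data points.
    centers = data[rng.choice(len(data), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Euclidean distance of every sample to every center.
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        # Membership: each sample belongs to its closest center.
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old center
                centers[j] = members.mean(axis=0)
    # Distances and memberships with respect to the final centers.
    dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
    return centers, dist.argmin(axis=1), dist
```

The returned distance array holds, for every time step, the distance to each of the k centers, which is the quantity the later sections work with.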



Simulated spatio-temporal data and results 

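Now, a low-dimensional simulated signal $\mathbf{A}(t)$ is introduced,
describing the amplitudes $A_i(t)$ of multi-dimensional spatial patterns
$\mathbf{v}_i$ by

\[
\mathbf{q}(t) = \sum_i A_i(t)\,\mathbf{v}_i .
\]

This superposition describes a spatio-temporal signal $\mathbf{q}(t)$.
The dataset $\mathbf{A}(t)$ is generated by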

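\[
\begin{aligned}
\dot{A}_1 &= \varepsilon A_1 - A_1\left[A_1^2 + (2+b)A_2^2 + (2-b)A_3^2\right] + \Gamma(t) \\
\dot{A}_2 &= \varepsilon A_2 - A_2\left[A_2^2 + (2+b)A_3^2 + (2-b)A_1^2\right] + \Gamma(t) \\
\dot{A}_3 &= \varepsilon A_3 - A_3\left[A_3^2 + (2+b)A_1^2 + (2-b)A_2^2\right] + \Gamma(t).
\end{aligned}
\tag{1}
\]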

Parameters are set to $\varepsilon = 1$ and $b = 2$, and
$\Gamma(t) \in [-0.05, 0.05]$ represents additive noise drawn from a uniform
distribution. The equations describe the convection onset of a Rayleigh-Bénard
experiment in the presence of rotation.
A 3-dimensional trajectory $\mathbf{A}(t)$ is calculated by 2200 integration
steps with the initial condition $\mathbf{A}(t = 0) = (0.03, 0.2, 0.8)$, see
Fig. 2. The trajectory passes the saddle points (0,0,1), (1,0,0) and (0,1,0) in
this sequence, and then returns to (0,0,1).
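
A trajectory of this kind can be generated with a simple integration loop. The sketch below assumes an explicit Euler scheme with step size `dt = 0.01` and independent noise per component; neither the integration scheme nor the step size is stated in the text, so both are assumptions, as are the function name `simulate` and its arguments.

```python
import numpy as np

def simulate(n_steps=2200, dt=0.01, eps=1.0, b=2.0, noise=0.05,
             a0=(0.03, 0.2, 0.8), seed=0):
    """Euler integration of the three cyclically coupled amplitude equations.

    Assumptions (not given in the text): explicit Euler, step size dt,
    and an independent uniform noise draw for each component.
    """
    rng = np.random.default_rng(seed)
    A = np.empty((n_steps, 3))
    A[0] = a0
    for i in range(1, n_steps):
        a = A[i - 1]
        sq = a ** 2
        # Cyclic coupling: component m is damped by its own square,
        # by (2+b) times the next component's square and by (2-b) times
        # the square of the one after that.
        coupling = sq + (2 + b) * np.roll(sq, -1) + (2 - b) * np.roll(sq, -2)
        gamma = rng.uniform(-noise, noise, size=3)  # additive uniform noise
        A[i] = a + dt * (eps * a - a * coupling + gamma)
    return A
```

Calling `simulate()` returns an array of shape (2200, 3); plotting its three columns against the step index gives a picture of how long the trajectory dwells near each saddle point under these assumed settings.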

The K-Means algorithm is applied to the simulated data for different numbers
of clusters k = 2, ..., 7. In Fig. 3, the Euclidean distances from each data
point to the determined cluster centers are plotted in temporal sequence for
each k. When the trajectory approaches a cluster center, its Euclidean distance
to that center decreases; when it moves away, the distance increases. These
changes can be observed in Fig. 3. For a fixed number of clusters, each data
point is considered a member of the cluster whose center is closest to it.
Comparing the clustering results obtained for different k, clustered time
windows [0; ~350], [~350; ~1050], [~1160; ~1610] and [~1740; 2200] are
recognized, whose borders remain similar for different k.
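
Reading off clustered time windows from a clustering result amounts to finding runs of consecutive data points with the same cluster membership. The following sketch shows one way to do this; the helper name `time_windows` and the half-open `(start, end, cluster)` convention are choices made for this illustration only.

```python
import numpy as np

def time_windows(labels):
    """Split a label sequence into contiguous runs.

    Returns a list of (start, end, cluster) tuples, where the window covers
    the time indices start, ..., end - 1 (end is exclusive).
    """
    labels = np.asarray(labels)
    if len(labels) == 0:
        return []
    # Positions at which the membership changes from one step to the next.
    change = np.flatnonzero(np.diff(labels)) + 1
    starts = np.concatenate(([0], change))
    ends = np.concatenate((change, [len(labels)]))
    return [(int(s), int(e), int(labels[s])) for s, e in zip(starts, ends)]
```

Comparing the window borders returned for several values of k gives a direct check of whether they remain similar, as observed above.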



3 The cluster criterion 

Although there might be only a limited number of clusters, fewer than k, in the
data, K-Means always determines k clusters, including void ones. In Fig. 3,
small clustered time windows are visible, whose occurrences and temporal widths
strongly depend on k. They are considered invalid clusters. Conversely, a first
qualitative criterion for valid clusters may be formulated as:

• cluster widths and locations in time remain independent of k and

• the Euclidean distances of clustered data points to their center are
clearly smaller than the Euclidean distances of these points to the next
nearest cluster center and

• the width of the clustered time window is not too small.

Although these criteria are heuristic rather than formal, they proved to be
useful in practice. We now try to formulate them quantitatively. The first item
can be formulated as a sum over all clustering results: valid contributions are
additive if they occur for all k, while others vanish in the sum as small
contributions. Thus the contribution of a valid cluster to the sum should be
large, whereas unreliable clusters should contribute small values. A good
quantity for these contributions is the area between the curve of the distance
from the signal to the nearest cluster center and the curve of the distance
from the signal to the next nearest cluster center. This definition allows an
analytical formulation of the second item and is outlined in Fig. 4. Each data
point $t_i$ obtains an index corresponding to the cluster $j$ it is a member
of. The index is equal to the relative area $A_j / T$, where $T$ denotes the
number of data points. By summing up the indices over $K$ cluster realizations
for every data point, a degree of membership $M(t)$ for every data point is
obtained:




\[
M(t_i) = \sum_{\kappa=1}^{K} \frac{A^{(\kappa)}_{j(t_i)}}{T}
\]



$M(t)$ represents a probability that a data point at time $t$ belongs to a
cluster. The application to the simulated data with $K = 30$ leads to the
results for $M(t)$ shown in Fig. 5. Clustered time windows can be recognized as
regions of high values of $M(t)$. Four plateaus of $M(t)$ are recognized at
[0;250], [340;1010], [1180;1560] and [1750;2200], with borders at drastic
changes in value. Regions between these plateaus are considered non-functional
transition parts. Comparing the time windows in the original signal (Fig. 2)
with the clustered time windows detected in Fig. 3 and Fig. 5 shows good
agreement between the time windows near fixed points and the cluster results.
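
The construction of $M(t)$ can be illustrated with a short sketch. It rests on one reading of the area criterion, which is an assumption here: for every contiguous time window assigned to one cluster, the area is taken as the sum of the differences between the next-nearest and the nearest center distance over that window, divided by the number of data points T, and every point in the window receives this value as its index. The function names `membership_index` and `degree_of_membership` are hypothetical.

```python
import numpy as np

def membership_index(dist):
    """Index per time point for one clustering result.

    dist: array of shape (T, k) with k >= 2, the distances of every data
    point to every cluster center.
    """
    T = dist.shape[0]
    ordered = np.sort(dist, axis=1)
    gap = ordered[:, 1] - ordered[:, 0]  # next-nearest minus nearest distance
    labels = dist.argmin(axis=1)
    # Borders of contiguous windows with constant membership.
    change = np.flatnonzero(np.diff(labels)) + 1
    starts = np.concatenate(([0], change))
    ends = np.concatenate((change, [T]))
    index = np.empty(T)
    for s, e in zip(starts, ends):
        # Relative area of this window, shared by all of its points.
        index[s:e] = gap[s:e].sum() / T
    return index

def degree_of_membership(dist_list):
    """Sum the indices over several clustering results (one dist per result)."""
    return np.sum([membership_index(d) for d in dist_list], axis=0)
```

Under this reading, short windows and windows in which the two closest centers are nearly equidistant contribute little to the sum, which matches the second and third items of the qualitative criterion.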



4 Conclusion 

The present brief report extends the fixed point clustering method by
introducing a probability function $M(t)$. High values of $M(t)$ indicate
clustered data points. Fixed point clustering relates temporal dynamics near
fixed points that show attractive and repelling properties to clusters in data
space. With the presented extension, regions in data space near such fixed
points can be determined independently of the number of clusters. Applications
to spatio-temporal signals in hydrodynamics, meteorology or brain science are
possible.

Figure 1: The implementation steps of the K-Means algorithm.

Figure 2: Trajectory of the 3-dimensional signal A(t). It starts near a saddle
point and passes two others, before it returns to the initial saddle point. The
numbers denote the time steps of the trajectory at their locations.



Figure 3: Cluster results for k = 2, .., 7. The Euclidean distances between data 
points and detected clusters are shown. 



Figure 4: Sketch to illustrate the introduced criterion of a cluster's
validity. The area $A_j$ between two distance curves indexes the data points
which belong to cluster j. Large areas indicate a high degree of membership.



Figure 5: Degree of membership M(t) for every data point as a sum of K = 30
clustering results. Plateaus denote valid clusters, which are delimited by rapid
changes.